Learning Objectives

Geospatial data types

There are two main classes of geospatial data:

  1. Vector data, which includes points, lines, and polygons.

  2. Raster data, which includes grids of cells (like pixels in a picture) such as satellite imagery data. Each cell is the same size.

CRS

There are many ways to plot geospatial data, and many coordinate reference systems (CRS) although only a handful are very common. The two most common ones that use degrees of longitude and latitude that you will encounter in the U.S. are the World Geodetic System 1984 (WGS84) and North American Datum 1983 (NAD83). You should also be familiar with Universal Transverse Mercator (UTM) systems and converting among other types of CRS if you would like to work in a GIS-heavy field. For this class, we will stick to geographic coordinate systems (latitude and longitude degree based).

Generally speaking, it may not be necessary to convert between NAD83 and WGS84 in the US since they often differ by a meter or less. However, it is good practice to convert and to track your CRS usage and presentation. For example, in this exercise we will use a Michigan 1800 land cover file, which you can see uses WGS84 in its metadata file.

Geospatial Data Basics

### Not all of these packages are needed for the exercise but you may find useful tools in them for the future
library(mapview)
library(sp)
library(sf)
library(raster) 
library(leafem)
library(rgdal)
library(dataRetrieval) # USGS data retrieval package. In this case, we will use it to access the Water Quality Portal
library(RColorBrewer) # Color palette for our map

Let’s first read in our 1800 Land Cover shapefile. A shapefile contains vector geospatial data. In this case, we are working with polygons that represent estimated historic vegetative land cover types in Michigan. Note that even though we are only bringing in the shapefile extension (.shp), you also need to have the other associated files like the .dbf (database) and .prj (projection) files in the same directory. This holds true for both GIS in R and for platforms like ArcGIS.

LC1800 <- st_read(file.path("Land_Cover_Circa_1800.shp"))
## Reading layer `Land_Cover_Circa_1800' from data source 
##   `C:\Users\hanlypat\Desktop\FW472_LA2\Land_Cover_Circa_1800.shp' 
##   using driver `ESRI Shapefile'
## Simple feature collection with 74405 features and 14 fields
## Geometry type: POLYGON
## Dimension:     XY
## Bounding box:  xmin: -90.41866 ymin: 41.69613 xmax: -82.41308 ymax: 48.23858
## Geodetic CRS:  WGS 84

You can see that this file contains 74,405 polygons with 14 fields of data. Most of these fields describe the geometries of the polygons, but we also have the county and vegetative cover (‘COVERTYPE’) information as well.

colnames(LC1800)
##  [1] "OBJECTID"   "VEGCODE"    "LEGCODE"    "COVERTYPE"  "CNTY_NAME" 
##  [6] "CREATE_UID" "CREATE_DTS" "UPDATE_UID" "UPDATE_DTS" "GENERALIZE"
## [11] "SHAPE_STAr" "SHAPE_STLe" "ShapeSTAre" "ShapeSTLen" "geometry"

Let’s check out the county names since we don’t need to use all of this data.

unique(LC1800$CNTY_NAME)
##  [1] "WEXFORD"        "WAYNE"          "WASHTENAW"      "VAN BUREN"     
##  [5] "TUSCOLA"        NA               "SHIAWASSEE"     "SCHOOLCRAFT"   
##  [9] "SANILAC"        "SAGINAW"        "ROSCOMMON"      "PRESQUE ISLE"  
## [13] "OTTAWA"         "OTSEGO"         "OSCODA"         "OSCEOLA"       
## [17] "ONTONAGON"      "OGEMAW"         "OCEANA"         "OAKLAND"       
## [21] "NEWAYGO"        "MUSKEGON"       "MONTMORENCY"    "MONTCALM"      
## [25] "MONROE"         "MISSAUKEE"      "MIDLAND"        "MENOMINEE"     
## [29] "MECOSTA"        "MASON"          "MARQUETTE"      "MANISTEE"      
## [33] "MACOMB"         "MACKINAC"       "LUCE"           "LIVINGSTON"    
## [37] "LENAWEE"        "LEELANAU"       "LAPEER"         "LAKE"          
## [41] "KEWEENAW"       "KENT"           "KALKASKA"       "KALAMAZOO"     
## [45] "JACKSON"        "ISABELLA"       "IRON"           "IOSCO"         
## [49] "IONIA"          "INGHAM"         "HURON"          "HOUGHTON"      
## [53] "HILLSDALE"      "GRATIOT"        "GRAND TRAVERSE" "GOGEBIC"       
## [57] "GLADWIN"        "GENESEE"        "EMMET"          "EATON"         
## [61] "DICKINSON"      "DELTA"          "CRAWFORD"       "CLINTON"       
## [65] "CLARE"          "CHIPPEWA"       "CHEBOYGAN"      "CHARLEVOIX"    
## [69] "CASS"           "CALHOUN"        "BRANCH"         "BERRIEN"       
## [73] "BENZIE"         "BAY"            "BARRY"          "BARAGA"        
## [77] "ARENAC"         "ANTRIM"         "ALPENA"         "ALLEGAN"       
## [81] "ALGER"          "ALCONA"

We are going to only work with two lakes in this exercise. Lake Lansing in Ingham county and Lake Ovid in Clinton county. You will work later with Lake Ovid, but here let’s let’s filter to just Ingham county and remove NAs. We can filter and manipulate the data table associated with the shapefile just as we did with our non-spatial data in the last activity.

LC1800_Ing <- LC1800[LC1800$CNTY_NAME == "INGHAM", ] # Filter to Ingham County
LC1800_Ing <- LC1800_Ing[!is.na(LC1800_Ing$COVERTYPE),] # Remove NA cover type
LC1800_Ing$VEGCODE <- as.numeric(LC1800_Ing$VEGCODE) # Make sure the vegetation code is represented as numeric data

We can then check out the unique vegetative covertypes in these counties. Note that we also have Lake/River and some other aquatic habitats such as marsh and bog.

unique(LC1800_Ing$COVERTYPE)
##  [1] "WET PRAIRIE"                "BEECH-SUGAR MAPLE FOREST"  
##  [3] "MIXED HARDWOOD SWAMP"       "OAK-HICKORY FOREST"        
##  [5] "MIXED CONIFER SWAMP"        "LAKE/RIVER"                
##  [7] "SHRUB SWAMP/EMERGENT MARSH" "BLACK OAK BARREN"          
##  [9] "MIXED OAK FOREST"           "MUSKEG/BOG"                
## [11] "MIXED OAK SAVANNA"          "BLACK ASH SWAMP"

Let’s also bring in our Lake Lansing shapefile. Instead of a shapefile with many features as above, this is just a single polygon (a type of vector).

Lansing <- st_read("Lake_Lansing.shp")
## Reading layer `Lake_Lansing' from data source 
##   `C:\Users\hanlypat\Desktop\FW472_LA2\Lake_Lansing.shp' using driver `ESRI Shapefile'
## Simple feature collection with 1 feature and 4 fields
## Geometry type: POLYGON
## Dimension:     XY
## Bounding box:  xmin: -84.41154 ymin: 42.74912 xmax: -84.39063 ymax: 42.76718
## Geodetic CRS:  WGS 84

We can see that both the vegetation data and the Lake Lansing shapefile use the WGS84 Coordinate Reference System so no transformations are needed.

Let’s first just check out what this looks like on a map using the mapview package and function. We specify that the “Z” data, or third dimension, we want to plot is the covertype and then we use a pre-made color palette to color our 15 different covertypes. We could individually specify colors that might look better, but let’s just see what the covertypes look like for now. We can layer mapviews on top of one another by layering maps with “+” between mapview functions:

mapview(LC1800_Ing, zcol = "COVERTYPE", col.regions = brewer.pal(12, "Dark2")) + mapview(Lansing, col.regions = "black")

Above we can see our dynamic map with the shapefile of Lake Lansing plotted in black. We first might ask what the area that is now Lake Lansing looked like in 1800. Many times, we need to first turn our polygons into raster data in order to do computations. Raster calculations are often less costly for large computations. We can “rasterize” our data above, turning it into pixel data with associated values.

ext <- extent(LC1800_Ing) # Get the extent of our Ingham county shapefile
raster_1800 <- raster(ext) # Create an empty raster for the same extent
raster_1800 <- rasterize(LC1800_Ing, raster_1800, field="VEGCODE", ) # Fill this empty raster with our numeric VEGCODE
mapview(raster_1800, col.regions = brewer.pal(12, "Dark2")) # Plot with mapview

Each number here corresponds to a single VEGCODE for a covertpye. Note how low resolution this is. We need to specify a finer resolution pixel or else we won’t be able to accurately calculate our covertypes.

ext <- extent(LC1800_Ing)
raster_1800 <- raster(ext, resolution = 0.001)
raster_1800 <- rasterize(LC1800_Ing, raster_1800, field="VEGCODE")
mapview(raster_1800, col.regions = brewer.pal(12, "Dark2"))

We then want to count up all the pixels in the raster for each VEGCODE based on the 1800 data that are within the current shapefile of Lake Lansing. We can do this using the extract() function, which will extract each pixel. Then, we can make a table of these pixel types to count them up. There are shortcuts for doing this, but I think it’s good to see how these calculations work under the hood.

extract(raster_1800, Lansing)
## [[1]]
##   [1]   52   52   52   52   52   52   52   52   52   52   52   52   52   52   52
##  [16]   52   52   52   52   52   52   52   52   52   52   52   52   52   52   52
##  [31]   52   52   52   52   52   52   52   52   52   52   52   52   52   52   52
##  [46]   52   52   52   52   52   52   52   52   52   52   52   52   52   52   52
##  [61]   52   52   52   52   52   52   52   52   52   52   52   52   52   52   52
##  [76]   52   52   52   52   52   52   52   52   52   52   52   52   52   52   52
##  [91]   52   52   52   52   52   52   52   52   52   52   52   52   52   52   52
## [106]   52   52   52   52   52   52   52   52   52   52   52   52   52   52   52
## [121]   52   52   52   52   52   52   52   52   52   52   52   52   52   52   52
## [136]   52   52   52   52   52   52   52   52   52   52   52   52   52   52   52
## [151]   52   52   52 4233   52   52   52   52   52   52   52   52   52   52   52
## [166]   52   52   52   52   52   52   52   52   52   52   52   52   52   52   52
## [181]   52   52   52   52   52   52   52   52   52   52   52   52   52   52   52
## [196]   52   52   52   52
table(extract(raster_1800, Lansing))
## 
##   52 4233 
##  198    1

We can see that 198 pixels from our raster were coded “52” and 1 was coded “4233”. We can create our own lookup table by looking for the unique combinations of VEGCODE and COVERTYPE.

print(LC1800_Ing[!duplicated(LC1800_Ing$VEGCODE),c('VEGCODE','COVERTYPE')])
## Simple feature collection with 20 features and 2 fields
## Geometry type: POLYGON
## Dimension:     XY
## Bounding box:  xmin: -84.60312 ymin: 42.49135 xmax: -84.15745 ymax: 42.77664
## Geodetic CRS:  WGS 84
## First 10 features:
##       VEGCODE                  COVERTYPE                       geometry
## 46701    6227                WET PRAIRIE POLYGON ((-84.1597 42.77663...
## 46702    4121   BEECH-SUGAR MAPLE FOREST POLYGON ((-84.18269 42.7765...
## 46703     414       MIXED HARDWOOD SWAMP POLYGON ((-84.1955 42.77629...
## 46708    4122         OAK-HICKORY FOREST POLYGON ((-84.24962 42.7761...
## 46709    4233        MIXED CONIFER SWAMP POLYGON ((-84.25036 42.7761...
## 46710      52                 LAKE/RIVER POLYGON ((-84.2526 42.77619...
## 46712    6221 SHRUB SWAMP/EMERGENT MARSH POLYGON ((-84.2637 42.77615...
## 46714     332           BLACK OAK BARREN POLYGON ((-84.27583 42.7760...
## 46721    4123           MIXED OAK FOREST POLYGON ((-84.33789 42.7758...
## 46724    6121                 MUSKEG/BOG POLYGON ((-84.30954 42.7744...

We can see that VEGCODE 52 is LAKE/RIVER and VEGCODE 4233 is MIXED CONIFER SWAMP. So, Lake Lansing was around in more or less its current shape in 1800 according to this historical estimate. Almost all 1800 pixels were coded as LAKE/RIVER (VEGCODE 52)

We can also work directly with the the shapefiles instead of making pixels. This can be more useful for cases like ours, but sometimes computations are faster when working with rasters if we are working with lots of data and locations. This function st_intersection() creates a new vector of the overlap between our Lake Lansing shapefile and our vegetative shapefile.

LC_Lansing <- st_intersection(Lansing, LC1800_Ing)
mapview(LC_Lansing, zcol = "COVERTYPE", col.regions = brewer.pal(12, "Dark2"))

We can see that without simplifying to pixels, we see some additional landcovers that we missed before. But, we can still see that there hasn’t been a significant change according to these data.

Water Quality Portal Data

Next, let’s bring in data from the Water Quality Portal (WQP) API using the USGS dataRetrieval package. You can check out their own tutorial here. First, we will need to determine where we want to draw data from. To keep it simple, let’s create a bounding box (4 coordinate square or rectangle) around Lake Lansing using the function st_bbox:

Lansing_bbox <- st_bbox(Lansing)
Lansing_bbox
##      xmin      ymin      xmax      ymax 
## -84.41154  42.74912 -84.39063  42.76718

You can see that the minimum/maximum latitudes and longitudes have been calculated.

We can then ask the WQP API what sampling sites are found in that bounding box using the function whatWQPsites() as well as what water quality parameter we want. You can check our the function documentation by putting ?whatWQPsamples in the console to see the other ways you can request data from this API such as HUC8 or county. Note that APIs will sometimes time out or be slower at times if they have many simultaneous requests. Since we know Secchi disk depth is very common, let’s try that:

Lansing_sites <- whatWQPsites(
  bBox = Lansing_bbox,
  characteristicName = "Depth, Secchi disk depth"
)
head(Lansing_sites)
## # A tibble: 3 × 37
##   OrganizationIdentifier OrganizationFormalName           MonitoringLocationId…¹
##   <chr>                  <chr>                            <chr>                 
## 1 USGS-MI                USGS Michigan Water Science Cen… USGS-424502084240705  
## 2 USGS-MI                USGS Michigan Water Science Cen… USGS-424547084241605  
## 3 NALMS                  North American Lake Management … NALMS-F630143         
## # ℹ abbreviated name: ¹​MonitoringLocationIdentifier
## # ℹ 34 more variables: MonitoringLocationName <chr>,
## #   MonitoringLocationTypeName <chr>, MonitoringLocationDescriptionText <chr>,
## #   HUCEightDigitCode <chr>, DrainageAreaMeasure.MeasureValue <dbl>,
## #   DrainageAreaMeasure.MeasureUnitCode <chr>,
## #   ContributingDrainageAreaMeasure.MeasureValue <dbl>,
## #   ContributingDrainageAreaMeasure.MeasureUnitCode <chr>, …

Assuming that the data hasn’t changed, you should see 3 sampling locations: 2 from the USGS MI Water Science Center and 1 from the North American Lake Management Society. Each sampling location will have a unique MonitoringLocationIdentifier. We can also see that all locations are classified as lake and we can see that the CRS is NAD83 (as opposed to the WGS84 of our map before). We will deal with that in a bit.

First, let’s use the readWQPdata() function to tell the WQP API what sites we want to call data from using the MonitoringLocationIdentifier values from above and again specifying that we want Secchi disk depth. The first function above got us the site information and this next function readWQPdata gets us the measured water quality values.

Lansing_Secchi <- readWQPdata(siteNumbers = Lansing_sites$MonitoringLocationIdentifier,
                        characteristicName = "Depth, Secchi disk depth")
Lansing_Secchi
##   OrganizationIdentifier                 OrganizationFormalName
## 1                USGS-MI     USGS Michigan Water Science Center
## 2                USGS-MI     USGS Michigan Water Science Center
## 3                USGS-MI     USGS Michigan Water Science Center
## 4                USGS-MI     USGS Michigan Water Science Center
## 5                  NALMS North American Lake Management Society
## 6                  NALMS North American Lake Management Society
##   ActivityIdentifier ActivityTypeCode ActivityMediaName
## 1 nwismi.01.00603035   Sample-Routine             Water
## 2 nwismi.01.00601191   Sample-Routine             Water
## 3 nwismi.01.00603067   Sample-Routine             Water
## 4 nwismi.01.00601164   Sample-Routine             Water
## 5  NALMS-2012_SECCHI    Field Msr/Obs             Water
## 6 NALMS-28314_SECCHI    Field Msr/Obs             Water
##   ActivityMediaSubdivisionName ActivityStartDate ActivityStartTime.Time
## 1                Surface Water        2006-08-24               10:15:00
## 2                Surface Water        2006-04-06               08:00:00
## 3                Surface Water        2006-08-24               09:15:00
## 4                Surface Water        2006-04-06               08:40:00
## 5                Surface Water        1996-07-04                   <NA>
## 6                Surface Water        2002-07-14                   <NA>
##   ActivityStartTime.TimeZoneCode ActivityEndDate ActivityEndTime.Time
## 1                            EST            <NA>                 <NA>
## 2                            EST            <NA>                 <NA>
## 3                            EST            <NA>                 <NA>
## 4                            EST            <NA>                 <NA>
## 5                           <NA>            <NA>                 <NA>
## 6                           <NA>            <NA>                 <NA>
##   ActivityEndTime.TimeZoneCode ActivityDepthHeightMeasure.MeasureValue
## 1                         <NA>                                       3
## 2                         <NA>                                       3
## 3                         <NA>                                       3
## 4                         <NA>                                       3
## 5                         <NA>                                      NA
## 6                         <NA>                                      NA
##   ActivityDepthHeightMeasure.MeasureUnitCode
## 1                                       feet
## 2                                       feet
## 3                                       feet
## 4                                       feet
## 5                                       <NA>
## 6                                       <NA>
##   ActivityDepthAltitudeReferencePointText
## 1                                    <NA>
## 2                                    <NA>
## 3                                    <NA>
## 4                                    <NA>
## 5                                    <NA>
## 6                                    <NA>
##   ActivityTopDepthHeightMeasure.MeasureValue
## 1                                         NA
## 2                                         NA
## 3                                         NA
## 4                                         NA
## 5                                         NA
## 6                                         NA
##   ActivityTopDepthHeightMeasure.MeasureUnitCode
## 1                                          <NA>
## 2                                          <NA>
## 3                                          <NA>
## 4                                          <NA>
## 5                                          <NA>
## 6                                          <NA>
##   ActivityBottomDepthHeightMeasure.MeasureValue
## 1                                            NA
## 2                                            NA
## 3                                            NA
## 4                                            NA
## 5                                            NA
## 6                                            NA
##   ActivityBottomDepthHeightMeasure.MeasureUnitCode  ProjectIdentifier
## 1                                             <NA>               <NA>
## 2                                             <NA>               <NA>
## 3                                             <NA>               <NA>
## 4                                             <NA>               <NA>
## 5                                             <NA> NALMS_SECCHI_DIPIN
## 6                                             <NA> NALMS_SECCHI_DIPIN
##                  ActivityConductingOrganizationText
## 1 U.S. Geological Survey-Water Resources Discipline
## 2 U.S. Geological Survey-Water Resources Discipline
## 3 U.S. Geological Survey-Water Resources Discipline
## 4 U.S. Geological Survey-Water Resources Discipline
## 5                                              MLSA
## 6                                              CLMP
##   MonitoringLocationIdentifier                            ActivityCommentText
## 1         USGS-424502084240705                                           <NA>
## 2         USGS-424547084241605                                           <NA>
## 3         USGS-424547084241605                                           <NA>
## 4         USGS-424502084240705                                           <NA>
## 5                NALMS-F630143                                  Old MLID=3232
## 6                NALMS-F630143 Old MLID=3232. Site=Deep Basin  (517) 339-3807
##   SampleAquifer HydrologicCondition HydrologicEvent
## 1          <NA>      Not applicable  Routine sample
## 2          <NA>      Not applicable  Routine sample
## 3          <NA>      Not applicable  Routine sample
## 4          <NA>      Not applicable  Routine sample
## 5          <NA>                <NA>            <NA>
## 6          <NA>                <NA>            <NA>
##   SampleCollectionMethod.MethodIdentifier
## 1                                    USGS
## 2                                    USGS
## 3                                    USGS
## 4                                    USGS
## 5                             SECCHI_DISK
## 6                             SECCHI_DISK
##   SampleCollectionMethod.MethodIdentifierContext
## 1                                           USGS
## 2                                           USGS
## 3                                           USGS
## 4                                           USGS
## 5                                          NALMS
## 6                                          NALMS
##   SampleCollectionMethod.MethodName SampleCollectionEquipmentName
## 1                              USGS                       Unknown
## 2                              USGS                       Unknown
## 3                              USGS                       Unknown
## 4                              USGS                       Unknown
## 5                       Secchi Disk                   Secchi Disk
## 6                       Secchi Disk                   Secchi Disk
##   ResultDetectionConditionText       CharacteristicName
## 1                         <NA> Depth, Secchi disk depth
## 2                         <NA> Depth, Secchi disk depth
## 3                         <NA> Depth, Secchi disk depth
## 4                         <NA> Depth, Secchi disk depth
## 5                         <NA> Depth, Secchi disk depth
## 6                         <NA> Depth, Secchi disk depth
##   ResultSampleFractionText ResultMeasureValue ResultMeasure.MeasureUnitCode
## 1                     <NA>              2.700                             m
## 2                     <NA>              1.800                             m
## 3                     <NA>              2.700                             m
## 4                     <NA>              1.800                             m
## 5                     <NA>              1.829                             m
## 6                     <NA>              2.743                             m
##   MeasureQualifierCode ResultStatusIdentifier StatisticalBaseCode
## 1                 <NA>            Preliminary                <NA>
## 2                 <NA>            Preliminary                <NA>
## 3                 <NA>            Preliminary                <NA>
## 4                 <NA>            Preliminary                <NA>
## 5                 <NA>                  Final                <NA>
## 6                 <NA>                  Final                <NA>
##   ResultValueTypeName ResultWeightBasisText ResultTimeBasisText
## 1              Actual                  <NA>                <NA>
## 2              Actual                  <NA>                <NA>
## 3              Actual                  <NA>                <NA>
## 4              Actual                  <NA>                <NA>
## 5              Actual                  <NA>                <NA>
## 6              Actual                  <NA>                <NA>
##   ResultTemperatureBasisText ResultParticleSizeBasisText PrecisionValue
## 1                       <NA>                        <NA>           <NA>
## 2                       <NA>                        <NA>           <NA>
## 3                       <NA>                        <NA>           <NA>
## 4                       <NA>                        <NA>           <NA>
## 5                       <NA>                        <NA>           <NA>
## 6                       <NA>                        <NA>           <NA>
##                                                             ResultCommentText
## 1                                                                        <NA>
## 2                                                                        <NA>
## 3                                                                        <NA>
## 4                                                                        <NA>
## 5                                                                        <NA>
## 6 Viewscope: No; Shady/sunny: Shady; Sunglasses: Yes; Reading Location: Boat;
##   USGSPCode ResultDepthHeightMeasure.MeasureValue
## 1     00078                                    NA
## 2     00078                                    NA
## 3     00078                                    NA
## 4     00078                                    NA
## 5      <NA>                                    NA
## 6      <NA>                                    NA
##   ResultDepthHeightMeasure.MeasureUnitCode
## 1                                     <NA>
## 2                                     <NA>
## 3                                     <NA>
## 4                                     <NA>
## 5                                     <NA>
## 6                                     <NA>
##   ResultDepthAltitudeReferencePointText SubjectTaxonomicName
## 1                                  <NA>                 <NA>
## 2                                  <NA>                 <NA>
## 3                                  <NA>                 <NA>
## 4                                  <NA>                 <NA>
## 5                                  <NA>                 <NA>
## 6                                  <NA>                 <NA>
##   SampleTissueAnatomyName ResultAnalyticalMethod.MethodIdentifier
## 1                    <NA>                                    <NA>
## 2                    <NA>                                    <NA>
## 3                    <NA>                                    <NA>
## 4                    <NA>                                    <NA>
## 5                    <NA>                                    <NA>
## 6                    <NA>                                    <NA>
##   ResultAnalyticalMethod.MethodIdentifierContext
## 1                                           <NA>
## 2                                           <NA>
## 3                                           <NA>
## 4                                           <NA>
## 5                                           <NA>
## 6                                           <NA>
##   ResultAnalyticalMethod.MethodName MethodDescriptionText LaboratoryName
## 1                              <NA>                  <NA>           <NA>
## 2                              <NA>                  <NA>           <NA>
## 3                              <NA>                  <NA>           <NA>
## 4                              <NA>                  <NA>           <NA>
## 5                              <NA>                  <NA>           <NA>
## 6                              <NA>                  <NA>           <NA>
##   AnalysisStartDate ResultLaboratoryCommentText
## 1              <NA>                        <NA>
## 2              <NA>                        <NA>
## 3              <NA>                        <NA>
## 4              <NA>                        <NA>
## 5              <NA>                        <NA>
## 6              <NA>                        <NA>
##   DetectionQuantitationLimitTypeName
## 1                               <NA>
## 2                               <NA>
## 3                               <NA>
## 4                               <NA>
## 5                               <NA>
## 6                               <NA>
##   DetectionQuantitationLimitMeasure.MeasureValue
## 1                                             NA
## 2                                             NA
## 3                                             NA
## 4                                             NA
## 5                                             NA
## 6                                             NA
##   DetectionQuantitationLimitMeasure.MeasureUnitCode PreparationStartDate
## 1                                              <NA>                 <NA>
## 2                                              <NA>                 <NA>
## 3                                              <NA>                 <NA>
## 4                                              <NA>                 <NA>
## 5                                              <NA>                 <NA>
## 6                                              <NA>                 <NA>
##   ProviderName timeZoneStart timeZoneEnd ActivityStartDateTime
## 1         NWIS             5          NA   2006-08-24 15:15:00
## 2         NWIS             5          NA   2006-04-06 13:00:00
## 3         NWIS             5          NA   2006-08-24 14:15:00
## 4         NWIS             5          NA   2006-04-06 13:40:00
## 5       STORET            NA          NA                  <NA>
## 6       STORET            NA          NA                  <NA>
##   ActivityEndDateTime
## 1                <NA>
## 2                <NA>
## 3                <NA>
## 4                <NA>
## 5                <NA>
## 6                <NA>

We can see that among these 3 sampling sites there are 6 Secchi disk depth measurements. We also have information on the date of the sampling and time. We will use the ResultMeasureValue column that has the value we want in meters (we can see the units in the next column ResultMeasure.MeasureUnitCode). Of course, it’s always good to double check that the units are all the same so let’s do that first.

unique(Lansing_Secchi$ResultMeasure.MeasureUnitCode)
## [1] "m"

Great, all the same units so we can proceed with analyzing the data. There’s not much data here so let’s just look at the mean of Secchi.

mean(Lansing_Secchi$ResultMeasureValue)
## [1] 2.262

We get a Secchi disk depth of 2.262 meters.

Finally, let’s map the sampling locations on to our lake shapefile. First, we have to create our spatial data points by specifying the latitude and longitude from our Lansing_sites data object:

Lansing_sf <- st_as_sf(Lansing_sites, coords = c('LongitudeMeasure', 'LatitudeMeasure'))

But, wait! We haven’t set the Coordinate Reference System. Let’s inspect the CRS using the function crs()

crs(Lansing_sf)
## [1] ""

You can see that it is empty since we haven’t set it. Each CRS is represented by a unique EPSG code number which we can use. The EPSG for NAD83 (our WQP data) is 4269 and the EPSG for WGS84 (our Lansing shapefile) is 4326. Let’s set it using st_set_crs. We will do this correctly and then also set the CRS incorrectly to the data by setting the same data to EPSG 4268 (NAD27 Michigan) to see what happens if the CRS is mispecified.

Lansing_sf_correctCRS <- st_as_sf(Lansing_sites, coords = c('LongitudeMeasure', 'LatitudeMeasure'), crs = 4269)
Lansing_sf_incorrectCRS <- st_as_sf(Lansing_sites, coords = c('LongitudeMeasure', 'LatitudeMeasure'), crs = 4268)

Let’s plot both sets of points on top of our Lake Lansing shapefile:

mapview(Lansing) + mapview(Lansing_sf_correctCRS, col.regions = "green") + mapview(Lansing_sf_incorrectCRS, col.regions = "red")

We can now see the three sampling locations where we have Secchi depth for Lake Lansing. You can also see that even when we specify the wrong CRS NAD27 Michigan vs. NAD83 we have about a 10 meter difference in location (you may need to zoom in). This wouldn’t affect much here but being off 10 meters can definitely matter for some use cases.

Exercise Questions (40 pts)

Now it’s your turn to do the same thing but with a different lake, Lake Ovid. You will also switch up some of the water quality parameters asked for from the API. If knitting is slow, you may want to # out the mapping code above so they are not generated by the report since I only need to see your new work.

Question 1 (10 pts)

First, go to the USGS National Map. Using the Watershed Boundary Dataset layer:

  • 1.1 What is the HUC2 of Lake Ovid? (1 point):

  • 1.2 What is the HUC6 of Lake Ovid? (1 point):

  • 1.3 What is the HUC8 of Lake Ovid? (1 point):

  • 1.4 What is the HUC12 of Lake Ovid? (1 point):

  • 1.5 Which of these 4 HUC level are shared by both Lake Ovid and Lake Lansing? (1 point):

Next, again using the USGS National Map, but switching to the National Hydrography Dataset:

  • 1.6 How many natural rivers inflow into Lake Ovid? (1 point):

  • 1.7 How many natural rivers outflow from Lake Ovid? (1 point):

  • 1.8 How many Canals/Ditches inflow into Lake Ovid? (1 point):

  • 1.9 How many Canals/Ditches outflow from Lake Ovid? (1 point):

  • 1.10 If you were to conduct a study on the nutrient balance of Lake Ovid, how might knowing the locations of the inflows and outflows influence your sampling design? (1 point; open-thought short response)

Question 2 (20 pts)

2.1 (2 points) Trim the original 1800 Michigan Land Cover map to just Clinton County, which includes our new lake of interest Lake Ovid. Lake Ovid is about a 30 minute drive north of campus and is within Sleepy Hollow State Park.

2.2 (2 points) Read in the shapefile of Lake Ovid called ‘Lake_Ovid.shp’. While you only need to call in the .shp extension shapefile, the other files associated with it that specify the projection and data must also be in the file folder to read in successfully.

2.3 (4 points) Make a mapview plot of Clinton County COVERTYPE (2 points) with Lake Ovid displayed on top (2 points). You may use any coloring scheme as long as each covertype is uniquely colored.

2.4 (6 points) Create a raster of Clinton county VEGCODE with a resolution of 0.001. [Hint: You will need to remove NA VEGCODES and use as.numeric() to ensure this data is represented numerically for the raster function. You can follow the same procedure as provided above]

2.5 (2 points) Make a table of the raster values by VEGCODE using the function freq() (1 point) and a VEGCODE lookup table to see what VEGCODE correspond to each COVERTYPE.

2.6 (2 points) What percent of Clinton County was Mixed Hardwood Swamp in 1800?

2.7 (2 points) Create a Vector format map of the intersection of 1800 Clinton County vegetation and today’s Lake Ovid.

  • What percent of Lake Ovid was water in 1800?

Question 3 (10 pts)

3.1 (2 points) Read in the Lake Ovid shapefile (1 point) and create an object representing the bounding box around Lake Ovid (1 point)

3.2 (2 points) Use the USGS dataRetrieval package to find the sampling sites in the Water Quality Portal for Lake Ovid that have the charactersiticName = “Phosphorus”.

How many unique monitoring locations are there for Lake Ovid phosphorus? (1 point):

What are the name(s) of the organizations that have sampled lake Ovid in the WQP? (1 point):

3.3 (2 points) Use the USGS dataRetrieval package bring the Lake Ovid phosphorus into your R session from the WQP API.

  • What units is phosphorus being sampled as? (1 point):

  • What is the average total phosphorus with units in Lake Ovid in 2002? (1 point):

3.4 (4 points) Check out the Michigan Clean Water Corp 2021 report on Lake Ovid.

  • Is the phosphorus data from the Michigan Clean Water Corp for Lake Ovid in the WQP? (1 point)

  • What is the mean total phosphorus from your WQP API pull in ppb? (1 point)

  • According to the Trophic Status Index table in the Michigan Clean Water Corp report, what trophic status category was Lake Ovid in 2002 based on phosphorus? (1 point):

  • How has the trophic status of Lake Ovid changed between 2002 and this 2021 report? (1 point):